ON THE LEARNING ALGORITHM OF 2-PERSON ZERO-SUM MARKOV GAME WITH EXPECTED AVERAGE REWARD CRITERION
Similar resources
On the equivalence of two expected average reward criteria for zero-sum semi-Markov games
In this paper we study two basic optimality criteria used in the theory of zero-sum semi-Markov games. According to the first one, the average reward for player 1 is the lim sup of the expected total rewards over a finite number of jumps divided by the expected cumulative time of these jumps. According to the second definition, the average reward (for player 1) is the lim sup of the expected to...
Reinforcement Learning for Average Reward Zero-Sum Games
We consider Reinforcement Learning for average reward zero-sum stochastic games. We present and analyze two algorithms. The first is based on relative Q-learning and the second on Q-learning for stochastic shortest path games. Convergence is proved using the ODE (Ordinary Differential Equation) method. We further discuss the case where not all the actions are played by the opponent with comparab...
Markov decision evolutionary games with time average expected fitness criterion
We present a class of evolutionary games involving large populations that have many pairwise interactions between randomly selected players. The fitness of a player depends not only on the actions chosen in the interaction but also on the individual state of the players. Players stay permanently in the system and participate infinitely often in local interactions with other randomly selected pl...
Bounded Parameter Markov Decision Processes with Average Reward Criterion
Bounded parameter Markov Decision Processes (BMDPs) address the issue of dealing with uncertainty in the parameters of a Markov Decision Process (MDP). Unlike the case of an MDP, the notion of an optimal policy for a BMDP is not entirely straightforward. We consider two notions of optimality based on optimistic and pessimistic criteria. These have been analyzed for discounted BMDPs. Here we pro...
A Two-person Zero-sum Game with Fractional Loss Function
In this paper, we investigate a two-person zero-sum game with fractional loss function, which we call the two-person zero-sum fractional game. We are interested in a game value and a saddle point of the two-person zero-sum game and observe that they exist under various conditions. But, in many cases, it seems to be difficult to search directly for a game value and a saddle point of the fraction...
Journal
Journal title: Bulletin of informatics and cybernetics
Year: 1985
ISSN: 0286-522X
DOI: 10.5109/13364